Biosynthetic gene cluster for jerangolids

ABSTRACT

Domains of jerangolid polyketide synthase and modification enzymes and polynucleotides encoding them are provided. Methods to prepare jerangolid in pharmaceutically useful quantities are described, as are methods to prepare jerangolid analogs and other polyketides using the polynucleotides encoding jerangolid synthase domains or modifying enzymes.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. patent application Ser. No. 11/109,593, filed 18 Apr. 2005, now U.S. Pat. No. 7,285,405, issued 23 Oct. 2007, which claims benefit under 35 U.S.C. §119 to U.S. provisional application Ser. No. 60/563,843, filed 19 Apr. 2005, the entire contents of each prior application being incorporated herein by reference.

Polyketides are complex natural products that are produced by microorganisms such as fungi and mycelial bacteria. There are about 10,000 known polyketides, from which numerous pharmaceutical products in many therapeutic areas have been derived, including adriamycin, epothilone, erythromycin, mevacor, rapamycin, tacrolimus, tetracycline, and many others. However, polyketides are made in very small amounts in microorganisms and are difficult to make or modify chemically. For this and other reasons, biosynthetic methods are preferred for production of therapeutically active polyketides. See PCT publication Nos. WO 93/13663; WO 95/08548; WO 96/40968; WO 97/02358; and WO 98/27203; U.S. Pat. Nos. 4,874,748; 5,063,155; 5,098,837; 5,149,639; 5,672,491; 5,712,146 and 6,410,301; Fu et al., 1994, Biochemistry 33:9321-26; McDaniel et al., 1993, Science 262:1546-1550; Kao et al., 1994, Science 265:509-12; and Rohr, 1995, Angew. Chem. Int. Ed. Engl. 34:881-88, each of which is incorporated herein by reference.

Biosynthesis of polyketides may be accomplished by heterologous expression of Type I or modular polyketide synthase enzymes (PKSs). Type I PKSs are large multifunctional protein complexes, the protein components of which are encoded by multiple open reading frames (ORFs) of PKS gene clusters. Each ORF of a Type I PKS gene cluster can encode one, two, or more modules of ketosynthase activity. Each module activates and incorporates a two-carbon (ketide) unit into the polyketide backbone. Each module also contains multiple ketide-modifying enzymatic activities, or domains. In classical Type I PKSs, the number and order of modules, and the types of ketide-modifying domains within each module, determine the structure of the resulting product. Recently, variants of Type I PKSs have been found in which single modules may be used in an iterative fashion to add more than one two-carbon unit to the growing polyketide chain (see, for example, Müller 2004). Polyketide synthesis may also involve the activity of nonribosomal peptide synthetases (NRPSs) to catalyze incorporation of an amino acid-derived building block into the polyketide, as well as post-synthesis modification, or tailoring, enzymes. The modification enzymes modify the polyketide by oxidation or reduction, addition of carbohydrate groups or methyl groups, or other modifications.
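The relationship between a module's ketide-modifying domains and the structure it leaves behind can be sketched in a few lines. The following is a minimal illustration only: it applies the common textbook rule of thumb for the beta-carbon (no reductive domain gives a keto group, KR a hydroxyl, KR plus DH an alkene, KR plus DH plus ER a methylene), and the three-module synthase shown is hypothetical rather than taken from any sequence described herein. Real synthases can deviate from this rule, for example through inactive domains or iterative module use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One extension module, described only by the names of its domains."""
    name: str
    domains: frozenset


def beta_carbon_state(module: Module) -> str:
    """Rule-of-thumb state of the beta-carbon after one extension cycle."""
    d = module.domains
    if {"KR", "DH", "ER"} <= d:
        return "methylene"   # fully reduced
    if {"KR", "DH"} <= d:
        return "alkene"      # reduced, then dehydrated
    if "KR" in d:
        return "hydroxyl"    # reduced only
    return "keto"            # no reductive processing


def describe(modules):
    """Pair each module name with its predicted beta-carbon state."""
    return [(m.name, beta_carbon_state(m)) for m in modules]


# Hypothetical three-module synthase, for illustration only.
EXAMPLE = [
    Module("module 1", frozenset({"KS", "AT", "KR", "ACP"})),
    Module("module 2", frozenset({"KS", "AT", "DH", "ER", "KR", "ACP"})),
    Module("module 3", frozenset({"KS", "AT", "ACP"})),
]

if __name__ == "__main__":
    for name, state in describe(EXAMPLE):
        print(f"{name}: {state}")
```

Representing each module as a set of domain names keeps the sketch independent of domain order, which is sufficient for this rule but would not capture stereochemistry or substrate specificity.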

In PKS polypeptides, the regions that encode enzymatic activities (domains) are separated by linker regions. These regions collectively can be considered to define boundaries of the various domains. Generally, this organization permits PKS domains of different or identical substrate specificities to be substituted (usually at the level of encoding DNA) from other PKSs by various available methodologies. Using this method, new polyketide synthases (which produce novel polyketides) can be produced. It will be recognized from the foregoing that genetic manipulation of PKS genes and heterologous expression of PKSs can be used for the efficient production of known polyketides, and for production of novel polyketides structurally related to, but distinct from, known polyketides (see references above, and Hutchinson, 1998, Curr. Opin. Microbiol. 1:319-29; Carreras and Santi, 1998, Curr. Opin. Biotech. 9:403-11; and U.S. Pat. Nos. 5,712,146 and 5,672,491, each of which is incorporated herein by reference).

One valuable class of polyketides includes the jerangolids and their analogs (FIG. 1), produced by various strains of the myxobacterium Sorangium cellulosum. Jerangolid A (1) as produced by Sorangium cellulosum strain So ce 307 was described by Gerth et al., “The Jerangolids: A Family of New Antifungal Compounds from Sorangium cellulosum (Myxobacteria); Production, Physico-chemical and Biological Properties of Jerangolid A,” J. Antibiotics 49: 71-75 (1996), along with four closely related analogs, jerangolids B, C, D, and E.

The jerangolids are anti-fungal agents showing partial structural resemblance to the ambruticins.

Given the promise of jerangolids in the treatment of fungal infections, there exists an unmet need for a production system that can provide large quantities of these polyketides. The present invention meets this need by providing the biosynthetic genes responsible for the production of jerangolids and providing for their expression in heterologous hosts.

SUMMARY OF THE INVENTION

The present invention provides recombinant nucleic acids encoding polyketide synthases and polyketide modification enzymes. The recombinant nucleic acids of the invention are useful in the production of polyketides, including but not limited to jerangolids and jerangolid analogs and derivatives, in recombinant host cells. The biosynthesis of the jerangolids is performed by a modular polyketide synthase (PKS) together with polyketide modification enzymes. The jerangolid PKS is made up of several proteins, each having one or more modules. The modules have domains with specific synthetic functions.

The present invention also provides domains and modules of the jerangolid PKS and corresponding nucleic acid sequences encoding them and/or parts thereof. Such compounds are useful in the production of hybrid PKS enzymes and the recombinant genes that encode them.

The present invention also provides modifying genes of the jerangolid biosynthetic gene cluster, including but not limited to isolated and recombinant forms and forms incorporated into a vector or the chromosomal DNA of a host cell.

The present invention also provides recombinant host cells that contain the nucleic acids of the invention. In one embodiment, the host cell provided by the invention is a Streptomyces host cell that produces a jerangolid modification enzyme and/or a domain, module, or protein of the jerangolid PKS. Methods for the genetic manipulation of Streptomyces are described in Kieser et al., “Practical Streptomyces Genetics,” The John Innes Foundation, Norwich (2000), which is incorporated herein by reference in its entirety. In other embodiments, the host cells provided by the invention are eubacterial cells such as Escherichia coli, yeast cells such as Saccharomyces cerevisiae, or myxobacterial cells such as Myxococcus xanthus.

Accordingly, there is provided a recombinant PKS wherein at least 10, 15, 20, or more consecutive amino acids in one or more domains of one or more modules thereof are derived from one or more domains of one or more modules of the jerangolid polyketide synthase. Preferably at least an entire domain of a module of the jerangolid synthase is included. Representative jerangolid PKS domains useful in this aspect of the invention include, for example, KR, DH, ER, AT, ACP and KS domains. In one embodiment of the invention, the PKS is assembled from polypeptides encoded by DNA molecules that comprise coding sequences for PKS domains, wherein at least one encoded domain corresponds to a domain of jerangolid PKS. In such DNA molecules, the coding sequences are operably linked to control sequences so that expression therefrom in host cells is effective. In this manner, jerangolid PKS coding sequences or modules and/or domains can be made to encode PKS to biosynthesize compounds having antibiotic or other useful bioactivity other than jerangolid.

These and other aspects of the present invention are described in more detail in the Detailed Description of the Invention, below.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the chemical structure of Jerangolid A.

FIG. 2 shows the organization of the jerangolid biosynthetic cluster as deduced from SEQ ID NO:1. FIG. 2A shows the organization of the portion of the gene cluster upstream of the polyketide synthase genes. FIG. 2B shows the organization of the portion of the gene cluster containing the polyketide synthase genes. FIG. 2C shows the organization of the portion of the gene cluster downstream of the polyketide synthase genes.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides recombinant materials for the production of polyketides. In an aspect, the invention provides recombinant nucleic acids encoding at least one domain of a polyketide synthase required for jerangolid biosynthesis. Methods and host cells for using these genes to produce a polyketide in recombinant host cells are also provided.

The nucleotide sequences encoding jerangolid PKS domains, modules and polypeptides of the present invention were isolated from Sorangium cellulosum So ce 307 as described in Example 1. Given the valuable properties of jerangolid and its derivatives and analogs, means to produce useful quantities of these molecules in a highly pure form are of great potential value. The compounds produced may be used as antitumor agents or for other therapeutic uses, and/or as intermediates for further enzymatic or chemical modification. The nucleotide sequences of the jerangolid biosynthetic gene cluster encoding domains, modules and polypeptides of jerangolid synthase, and modifying enzymes, and other polypeptides can be used, for example, to make both known and novel polyketides.

In one aspect of the invention, purified and isolated DNA molecules are provided that comprise one or more coding sequences for one or more domains or modules of jerangolid synthase. Examples of such encoded domains include jerangolid synthase KR, DH, ER, AT, ACP, and KS domains. Domains will herein be referred to according to the module in which they are found as “domain(module)”; for example, the module 1 AT domain will be referred to as “AT(1).” In one aspect, the invention provides DNA molecules in which sequences encoding one or more polypeptides of jerangolid synthase are operably linked to expression control sequences that are effective in suitable host cells to produce jerangolid, its analogs or derivatives, or novel polyketides.

The sequence of the beginning of the jerangolid PKS gene cluster was assembled from sequences deduced from the cosmid 10K10B3 (FIG. 2) and is shown as SEQ ID NO:1. This partial PKS gene cluster is found to comprise five open reading frames (ORFs), named jerA, jerB, jerC, jerD, and jerE. The jerA gene encodes the loading module of the jerangolid PKS, also referred to herein as “module 0,” and comprises KS and AT domains. The KS(0) domain is apparently inactive as a ketosynthase, having the active site cysteine residue replaced with a serine, and is thought to act as a decarboxylase to prime the PKS with a propionate group derived from methylmalonate. The AT(0) domain comprises the signature amino acid sequences (GHSQ and YASH) of a methylmalonyl-specific AT domain. The jerB gene encodes modules 1 and 2 of the jerangolid PKS, the jerC gene encodes modules 3 and 4, the jerD gene encodes module 5, and the jerE gene encodes modules 6 and 7 along with a chain terminating thioesterase (TE) domain. Table 1 provides a description of the genes, modules, and domains of the five jerangolid PKS proteins. A further gene, jerF, encodes an O-methyltransferase thought to be involved in addition of the methyl group to O-3 of jerangolid.
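As an illustration of how the AT signature sequences named above might be checked in a translated domain, the following sketch scans a protein string for the two motifs GHSQ and YASH. The function names and both peptide fragments are invented for the example and are not taken from SEQ ID NO:1; the presence of both motifs is treated here as a simple indicator only, not as proof of substrate specificity.

```python
# Signature motifs named in the text for a methylmalonyl-specific AT domain.
MMAL_MOTIFS = ("GHSQ", "YASH")


def find_motifs(protein: str, motifs=MMAL_MOTIFS) -> dict:
    """Return the 1-based position of each motif, or None when it is absent."""
    seq = protein.upper()
    hits = {}
    for motif in motifs:
        idx = seq.find(motif)
        hits[motif] = idx + 1 if idx >= 0 else None
    return hits


def looks_methylmalonyl_specific(protein: str) -> bool:
    """True when every signature motif occurs somewhere in the sequence."""
    return all(pos is not None for pos in find_motifs(protein).values())


# Made-up peptide fragments for illustration; not from SEQ ID NO:1.
TOY_AT = "MAVLFPGHSQGEIAAAVLSGALRVVDYASHSPQ"
TOY_OTHER = "MAVLFPGHSIGEIAAAVLSGALRVVDHAFHSPQ"

if __name__ == "__main__":
    print(find_motifs(TOY_AT))
    print(looks_methylmalonyl_specific(TOY_OTHER))
```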

TABLE 1
Genes, modules, and domains of the five proteins of the jerangolid PKS determined from the nucleotide sequence given in SEQ ID NO: 1.

Gene    Module      Domain    Boundaries
JerA                          15751-18978
        module 0              15859-18831
                    KS(0)     15859-17133
                    AT(0)     17461-18513
                    ACP(0)    18577-18831
JerB                          19013-30074
        module 1              19134-23507
                    KS(1)     19134-20408
                    AT(1)     20715-21767
                    KR(1)     22398-23219
                    ACP(1)    23250-23507
        module 2              23559-29816
                    KS(2)     23559-24836
                    AT(2)     25167-26234
                    DH(2)     26268-26819
                    ER(2)     27822-28697
                    KR(2)     28707-29522
                    ACP(2)    29559-29816
JerC                          30071-41035
        module 3              30170-35440
                    KS(3)     30170-31447
                    AT(3)     31772-32824
                    DH(3)     32858-33409
                    KR(3)     34322-35161
                    ACP(3)    35183-35440
        module 4              35507-40789
                    KS(4)     35507-36784
                    AT(4)     37115-38182
                    DH(4)     38216-38776
                    KR(4)     39695-40519
                    ACP(4)    40532-40789
JerD                          41032-46674
        module 5              41131-46416
                    KS(5)     41131-42408
                    AT(5)     42733-43800
                    DH(5)     43834-44430
                    KR(5)     45307-46125
                    ACP(5)    46159-46416
JerE                          46671-55280
        module 6              46773-51383
                    KS(6)     46773-48050
                    AT(6)     48381-49448
                    KR(6)     50295-50960
                    ACP(6)    51126-51383
        module 7              51462-54443
                    KS(7)     51462-52742
                    AT(7)     53052-54098
                    ACP(7)    54189-54443
                    TE        54444-55280
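The boundaries in Table 1 can be used directly to size a domain or to cut its coding region out of a sequence string. The sketch below assumes the boundaries are 1-based and inclusive nucleotide positions in SEQ ID NO:1, which is the usual convention for sequence listings but is not stated explicitly in the table. Only four rows are copied in, and the helper names are hypothetical.

```python
# Selected rows of Table 1. Assumption: positions are 1-based and inclusive.
DOMAINS = {
    "KS(0)": (15859, 17133),
    "AT(0)": (17461, 18513),
    "ACP(0)": (18577, 18831),
    "TE": (54444, 55280),
}


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive interval."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in the interval; rejects partial codons."""
    n = span_length(start, end)
    if n % 3:
        raise ValueError(f"interval of {n} nt is not a whole number of codons")
    return n // 3


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based, inclusive interval out of a 0-based Python string."""
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, (start, end) in DOMAINS.items():
        print(name, span_length(start, end), "nt,", codon_count(start, end), "codons")
```

Under that assumption, each of the four intervals shown is a whole number of codons, which is a quick sanity check that the boundaries are in frame.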

In one aspect, the invention provides an isolated or recombinant DNA molecule comprising a nucleotide sequence that encodes at least one domain, alternatively at least one module, alternatively at least one polypeptide, involved in the biosynthesis of a jerangolid.

In one aspect, the invention provides an isolated or recombinant DNA molecule comprising a sequence identical or substantially similar to SEQ ID NO:1 or its complement. Hereinafter, each reference to a nucleic acid sequence is also intended to refer to and include the complementary sequence, unless otherwise stated or apparent from context. In an embodiment, the subsequence comprises a sequence encoding a complete jerangolid PKS domain, module or polypeptide.

In one aspect, the present invention provides an isolated or recombinant DNA molecule comprising a nucleotide sequence that encodes an open reading frame, module or domain having an amino acid sequence identical or substantially similar to an ORF, module or domain encoded by SEQ ID NO: 1. Generally, a polypeptide, module or domain having a sequence substantially similar to a reference sequence has substantially the same activity as the reference protein, module or domain (e.g., when integrated into an appropriate PKS framework using methods known in the art). In certain embodiments, one or more activities of a substantially similar polypeptide, module or domain are modified or inactivated as described below.

In one aspect, the invention provides an isolated or recombinant DNA molecule comprising a nucleotide sequence that encodes at least one polypeptide, module or domain encoded by SEQ ID NO:1, e.g., a polypeptide, module or domain involved in the biosynthesis of a jerangolid, wherein said nucleotide sequence comprises at least 10, 20, 25, 30, 35, 40, 45, or 50 contiguous base pairs identical to a sequence of SEQ ID NO: 1. In one aspect, the invention provides an isolated or recombinant DNA molecule comprising a nucleotide sequence that encodes at least one polypeptide, module or domain encoded by SEQ ID NO:1, e.g., a polypeptide, module or domain involved in the biosynthesis of a jerangolid, wherein said polypeptide, module or domain comprises at least 10, 15, 20, 30, or 40 contiguous residues of a corresponding polypeptide, module or domain comprising a sequence of SEQ ID NO: 1.
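Whether a candidate sequence shares a run of contiguous identical bases with a reference can be tested with a simple k-mer lookup. The sketch below is one possible approach, not a prescribed method: the function name and both sequences are hypothetical, the default run length of 10 is the smallest value recited above, and only the strand as written is compared (the complementary strand would need a separate call).

```python
def shares_contiguous_run(candidate: str, reference: str, k: int = 10) -> bool:
    """True when some k consecutive bases of candidate occur in reference."""
    if k <= 0:
        raise ValueError("k must be positive")
    candidate, reference = candidate.upper(), reference.upper()
    if len(candidate) < k or len(reference) < k:
        return False
    # Index every k-mer of the reference once, then probe with the candidate.
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(candidate[i:i + k] in ref_kmers
               for i in range(len(candidate) - k + 1))


# Toy sequences for illustration; the shared run is 12 bases long.
REFERENCE = "ATGGCGTACCTGAAGCTTGGATCC"
CANDIDATE = "TTTTGCGTACCTGAAGAAAA"

if __name__ == "__main__":
    print(shares_contiguous_run(CANDIDATE, REFERENCE, k=10))
    print(shares_contiguous_run(CANDIDATE, REFERENCE, k=13))
```

Building a set of reference k-mers costs memory proportional to the reference length; for a sequence the size of a gene cluster that is modest, though a suffix-based index would scale better.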

It will be understood that SEQ ID NO: 1 was determined using the inserts of cosmids 307K-3F11, 307K-5G2, and 307K-2C8. Accordingly, the invention provides an isolated or recombinant DNA molecule comprising a sequence identical or substantially similar to an ORF encoding sequence of the insert of cosmids 307K-3F11, 307K-5G2, or 307K-2C8.

Those of skill will recognize that, due to the degeneracy of the genetic code, a large number of DNA sequences encode the amino acid sequences of the domains, modules, and proteins of the jerangolid PKS, the enzymes involved in jerangolid modification and other polypeptides encoded by the genes of the jerangolid biosynthetic gene cluster. The present invention contemplates all such DNAs. For example, it may be advantageous to optimize sequence to account for the codon preference of a host organism. The invention also contemplates naturally occurring genes encoding the jerangolid PKS that are polymorphic or other variants.
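The scale of this degeneracy is easy to quantify. The sketch below builds the standard genetic code, counts how many distinct DNA sequences encode a short peptide, and back-translates the peptide with a table of preferred codons. The preferred-codon table is a hypothetical placeholder chosen only to show the mechanics; actual codon preferences would have to be taken from usage data for the chosen host.

```python
from itertools import product
from math import prod

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(aa: str) -> list:
    """All codons of the standard code that encode the given residue."""
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(synonymous_codons(aa)) for aa in peptide.upper())


def back_translate(peptide: str, preferred: dict) -> str:
    """Encode the peptide, using the preferred codon where one is given."""
    return "".join(preferred.get(aa, synonymous_codons(aa)[0])
                   for aa in peptide.upper())


# Hypothetical preferred codons, for illustration only.
PREFERRED = {"G": "GGC", "H": "CAC", "S": "TCG", "Q": "CAG"}

if __name__ == "__main__":
    print(count_encodings("GHSQ"))
    print(back_translate("GHSQ", PREFERRED))
```

Even a four-residue motif has dozens of possible encodings, so the number for a whole PKS protein is astronomically large, which is why codon choice is normally driven by a host-specific table rather than enumerated.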

As used herein, the terms “substantial identity,” “substantial sequence identity,” or “substantial similarity” in the context of nucleic acids, refer to a measure of sequence similarity between two polynucleotides. Substantial sequence identity can be determined by hybridization under stringent conditions, by direct comparison, or other means. For example, two polynucleotides can be identified as having substantial sequence identity if they are capable of specifically hybridizing to each other under stringent hybridization conditions. Other degrees of sequence identity (e.g., less than “substantial”) can be characterized by hybridization under different conditions of stringency. “Stringent hybridization conditions” refers to conditions in a range from about 5° C. to about 20° C. or 25° C. below the melting temperature (Tm) of the target sequence and a probe with exact or nearly exact complementarity to the target. As used herein, the melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half-dissociated into single strands. Methods for calculating the Tm of nucleic acids are well known in the art (see, e.g., Berger and Kimmel, 1987, Methods In Enzymology, Vol. 152: Guide To Molecular Cloning Techniques, San Diego: Academic Press, Inc. and Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 2nd Ed., Vols. 1-3, Cold Spring Harbor Laboratory). Typically, stringent hybridization conditions for probes greater than 50 nucleotides are salt concentrations less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion at pH 7.0 to 8.3, and temperatures at least about 50° C., preferably at least about 60° C. As noted, stringent conditions may also be achieved with the addition of destabilizing agents such as formamide, in which case lower temperatures may be employed. Exemplary conditions include hybridization at 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO₄ pH 7.0, 1 mM EDTA at 65° C.; wash with 2×SSC, 1% SDS, at 50° C.
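The temperature window described above can be worked through numerically. The sketch below estimates Tm with one widely quoted empirical approximation for long duplexes, Tm = 81.5 + 16.6·log10[Na+] + 41·(GC fraction) − 675/N, and then reports the range from 25° C. to 5° C. below that value. The formula is an assumption brought in for illustration (the text does not prescribe one, and published variants use slightly different constants), it ignores formamide and mismatches, and the probe is synthetic.

```python
import math


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(seq: str, sodium_molar: float = 0.1) -> float:
    """Approximate Tm in degrees C for a long DNA duplex (assumed formula)."""
    n = len(seq)
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 41.0 * gc_fraction(seq)
            - 675.0 / n)


def stringent_window(tm: float, lower_offset: float = 25.0,
                     upper_offset: float = 5.0) -> tuple:
    """Temperature range lying 25 to 5 degrees C below the given Tm."""
    return (tm - lower_offset, tm - upper_offset)


# Synthetic 100-nt probe with 70% GC content, for illustration only.
PROBE = "G" * 35 + "C" * 35 + "A" * 15 + "T" * 15

if __name__ == "__main__":
    tm = estimate_tm(PROBE, sodium_molar=1.0)
    low, high = stringent_window(tm)
    print(f"Tm ~ {tm:.2f} C; stringent range ~ {low:.2f} to {high:.2f} C")
```

A high GC content such as that chosen here raises the estimate, and lowering the sodium concentration lowers it.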

Alternatively, substantial sequence identity can be described as a percentage identity between two nucleotide or amino acid sequences. Two nucleic acid sequences are considered substantially identical when they are at least about 70% identical, or at least about 80% identical, or at least about 90% identical, or at least about 95% or 98% identical. Two amino acid sequences are considered substantially identical when they are at least about 60% identical, more often at least about 70%, at least about 80%, or at least about 90% identical, to the reference sequence. Percentage sequence (nucleotide or amino acid) identity is typically calculated using art known means to determine the optimal alignment between two sequences and comparing the two sequences. Optimal alignment of sequences may be conducted using the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2: 482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48: 443, by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. U.S.A. 85: 2444, by the BLAST algorithm of Altschul (1990) J. Mol. Biol. 215: 403-410; and Shpaer (1996) Genomics 38:179-191, or by the method of Needleman et al. (1970) J. Mol. Biol. 48: 443-453; and Sankoff et al., 1983, Time Warps, String Edits, and Macromolecules, The Theory and Practice of Sequence Comparison, Chapter One, Addison-Wesley, Reading, Mass.; generally by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.; BLAST from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)). In each case default parameters are used (for example the BLAST program uses as defaults a wordlength (W) of 11, the BLOSUM62 scoring matrix (see Henikoff (1992) Proc. Natl. Acad. Sci. USA 89: 10915-10919), alignments (B) of 50, expectation (E) of 10, M=5, N=−4, and a comparison of both strands).
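To make the percentage-identity calculation concrete, the following sketch implements a basic global alignment in the style of Needleman and Wunsch and then scores identity over the aligned columns. It is a teaching example under stated assumptions, not a substitute for the packages named above: the match, mismatch, and linear gap scores are arbitrary, no substitution matrix is used, and identity is taken as matching columns divided by alignment length, which is only one of several conventions in use.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2) -> tuple:
    """Global alignment with a linear gap penalty; returns two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned columns as a percentage of alignment length."""
    x, y = needleman_wunsch(a.upper(), b.upper())
    if not x:
        raise ValueError("cannot score two empty sequences")
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

The quadratic table makes this practical only for short sequences; comparing whole ORFs or clusters is normally left to the heuristic tools cited in the text.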

The invention methods may be directed to the preparation of an individual polyketide. The polyketide may or may not be novel, but the method of preparation permits a more convenient or alternative method of preparing it. The resulting polyketides may be further modified to convert them to other useful compounds. Examples of chemical structures that can be made using the materials and methods of the present invention include known analogs, such as those described in Kalesse & Christmann, 2002, “The Chemistry and Biology of the Jerangolid Family” Synthesis (8):981-1003 and the references cited therein, and novel molecules produced by modified or chimeric PKSs comprising a portion of the jerangolid PKS sequence, molecules produced by the action of polyketide modifying enzymes from the jerangolid PKS cluster on products of other PKSs, molecules produced by the action on products of the jerangolid PKS of polyketide modifying enzymes from other PKSs, and the like. As noted, in one aspect the invention provides recombinant PKS wherein at least 10, 15, 20, or more consecutive amino acids in one or more domains of one or more modules thereof are derived from one or more domains of one or more modules of the jerangolid polyketide synthase. A polyketide synthase “derived from” a naturally occurring PKS contains the scaffolding encoded by all the portion employed of the naturally occurring synthase gene, contains at least two modules that are functional, and contains mutations, deletions, or replacements of one or more of the activities of these functional modules so that the nature of the resulting polyketide is altered. This definition applies both at the protein and genetic levels. Particular embodiments include those wherein a KS, AT, KR, DH, or ER has been deleted or replaced by a version of the activity from a different PKS or from another location within the same PKS, and derivatives where at least one noncondensation cycle enzymatic activity (KR, DH, or ER) has been deleted or wherein any of these activities has been added or mutated so as to change the ultimate polyketide synthesized. There are at least five degrees of freedom for constructing a polyketide synthase in terms of the polyketide that will be produced. See U.S. Pat. No. 6,509,455 for a discussion.

As can be appreciated by those skilled in the art, polyketide biosynthesis can be manipulated to make a product other than the product of a naturally occurring PKS biosynthetic cluster. For example, AT domains can be altered or replaced to change specificity. The variable domains within a module can be deleted and/or inactivated or replaced with other variable domains found in other modules of the same PKS or from another PKS. See, e.g., Katz & McDaniel, Med Res Rev 19: 543-558 (1999) and WO 98/49315. Similarly, entire modules can be deleted and/or replaced with other modules from the same PKS or another PKS. See, e.g., Gokhale et al., Science 284: 482 (1999) and WO 00/47724, each of which is incorporated herein by reference. Protein subunits of different PKSs also can be mixed and matched to make compounds having the desired backbone and modifications. For example, subunits 1 and 2 (encoding modules 1-4) of the pikromycin PKS were combined with the DEBS3 subunit to make a hybrid PKS product (see Tang et al., Science, 287: 640 (2001), WO 00/26349 and WO 99/6159). Mutations can be introduced into PKS genes such that polypeptides with altered activity are encoded. Polypeptides with “altered activity” include those in which one or more domains are inactivated or deleted, or in which a mutation changes the substrate specificity of a domain, as well as other alterations in activity. Mutations can be made to the native sequences using conventional techniques. The substrates for mutation can be an entire cluster of genes or only one or two of them; the substrate for mutation may also be portions of one or more of these genes. Techniques for mutation include preparing synthetic oligonucleotides including the mutations and inserting the mutated sequence into the gene encoding a PKS subunit using restriction endonuclease digestion. (See, e.g., Kunkel, T. A. Proc Natl Acad Sci USA (1985) 82:448; Geisselsoder et al. BioTechniques (1987) 5:786.) Alternatively, the mutations can be effected using a mismatched primer (generally 10-20 nucleotides in length) that hybridizes to the native nucleotide sequence (generally cDNA corresponding to the RNA sequence), at a temperature below the melting temperature of the mismatched duplex. The primer can be made specific by keeping primer length and base composition within relatively narrow limits and by keeping the mutant base centrally located. (See Zoller and Smith, Methods in Enzymology (1983) 100:468). Primer extension is effected using DNA polymerase. The product of the extension reaction is cloned, and those clones containing the mutated DNA are selected. Selection can be accomplished using the mutant primer as a hybridization probe. The technique is also applicable for generating multiple point mutations. (See, e.g., Dalbie-McFarland et al. Proc Natl Acad Sci USA (1982) 79:6409). PCR mutagenesis can also be used for effecting the desired mutations. Random mutagenesis of selected portions of the nucleotide sequences encoding enzymatic activities can be accomplished by several different techniques known in the art, e.g., by inserting an oligonucleotide linker randomly into a plasmid.
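The geometry of a mismatched primer, with the mutant base centrally located in an oligonucleotide of 10-20 nucleotides, can be sketched as follows. The function name, the default length of 19, and the template are hypothetical; the template is a made-up coding fragment in which a cysteine codon (TGC) is changed to a serine codon (AGC). The sketch returns the primer in the same sense as the template string and does not check melting temperature, secondary structure, or which strand the primer must anneal to.

```python
def design_mismatch_primer(template: str, position: int,
                           new_base: str, length: int = 19) -> str:
    """Return a primer of odd length with the mutant base at its center.

    position is the 1-based index of the base to change in the template.
    """
    if length % 2 == 0:
        raise ValueError("use an odd length so the mutant base is central")
    template, new_base = template.upper(), new_base.upper()
    half = length // 2
    idx = position - 1
    start, end = idx - half, idx + half + 1
    if start < 0 or end > len(template):
        raise ValueError("not enough flanking sequence around the position")
    if template[idx] == new_base:
        raise ValueError("new base is identical to the template base")
    window = template[start:end]
    return window[:half] + new_base + window[half + 1:]


# Made-up template: ATG GCT GAC CTG TGC GGT CAC TCC GAA GGT TAA.
# Changing base 13 from T to A converts the codon TGC (Cys) to AGC (Ser).
TEMPLATE = "ATGGCTGACCTGTGCGGTCACTCCGAAGGTTAA"

if __name__ == "__main__":
    print(design_mismatch_primer(TEMPLATE, position=13, new_base="A"))
```

Requiring an odd length is a design choice of this sketch: it guarantees equal flanks on both sides of the mismatch, which is the simplest way to keep the mutant base centered.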

In addition to providing mutated forms of regions encoding enzymatic activity, regions encoding corresponding activities from different PKS synthases or from different locations in the same PKS synthase can be recovered, for example, using PCR techniques with appropriate primers. By “corresponding” activity encoding regions is meant those regions encoding the same general type of activity; e.g., a ketoreductase activity in one location of a gene cluster would “correspond” to a ketoreductase-encoding activity in another location in the gene cluster or in a different gene cluster. Similarly, a complete reductase cycle could be considered corresponding; e.g., KR/DH/ER could correspond to KR alone.

If replacement of a particular target region in a host polyketide synthase is to be made, this replacement can be conducted in vitro using suitable restriction enzymes or can be effected in vivo using recombinant techniques involving homologous sequences framing the replacement gene. One such system involving plasmids of differing temperature sensitivities is described in PCT application WO 96/40968. Another useful method for modifying a PKS gene (e.g., making domain substitutions or “swaps”) is a RED/ET cloning procedure developed for constructing domain swaps or modifications in an expression plasmid without first introducing restriction sites. The method is related to ET cloning methods (see Datsenko & Wanner, 2000, Proc. Natl. Acad. Sci. U.S.A. 97, 6640-45; Muyrers et al., 2000, Genetic Engineering 22:77-98). The RED/ET cloning procedure is used to introduce a unique restriction site in the recipient plasmid at the location of the targeted domain. This restriction site is then used to linearize the recipient plasmid in a subsequent ET cloning step to introduce the modification. This linearization step is necessary in the absence of a selectable marker, which cannot be used for domain substitutions. An advantage of using this method for PKS engineering is that restriction sites do not have to be introduced in the recipient plasmid in order to construct the swap, which makes it faster and more powerful because boundary junctions can be altered more easily.

In a further aspect, the invention provides methods for expressing chimeric or hybrid PKSs and products of such PKSs. For example, the invention provides (1) encoding DNA for a chimeric PKS that is substantially patterned on a non-jerangolid producing enzyme, but which includes one or more functional domains, modules or polypeptides of jerangolid PKS; and (2) encoding DNA for a chimeric PKS that is substantially patterned on the jerangolid PKS, but which includes one or more functional domains, modules, or polypeptides of another PKS or NRPS.

With respect to item (1) above, in one embodiment, the invention provides chimeric PKS enzymes in which the genes for a non-jerangolid PKS function as accepting genes, and one or more of the above-identified coding sequences for jerangolid domains or modules are inserted as replacements for one or more domains or modules of comparable function. Construction of chimeric molecules is most effectively achieved by construction of appropriate encoding polynucleotides. In making a chimeric molecule, it is not necessary to replace an entire domain or module of the accepting PKS with an entire domain or module of jerangolid PKS: subsequences of a PKS domain or module that correspond to a peptide subsequence in an accepting domain or module, or which otherwise provide useful function, may be used as replacements. Accordingly, appropriate encoding DNAs for construction of such chimeric PKS include those that encode at least 10, 15, 20 or more amino acids of a selected jerangolid domain or module.

Recombinant methods for manipulating modular PKS genes to make chimeric PKS enzymes are described in U.S. Pat. Nos. 5,672,491; 5,843,718; 5,830,750; and 5,712,146; and in PCT publication Nos. WO 98/49315 and WO 97/02358. A number of genetic engineering strategies have been used with DEBS to demonstrate that the structures of polyketides can be manipulated to produce novel natural products, primarily analogs of the erythromycins (see the patent publications referenced supra and Hutchinson, 1998, Curr Opin Microbiol. 1:319-329, and Baltz, 1998, Trends Microbiol. 6:76-83). In one embodiment, the components of the chimeric PKS are arranged onto polypeptides having interpolypeptide linkers that direct the assembly of the polypeptides into the functional PKS protein, such that it is not required that the PKS have the same arrangement of modules in the polypeptides as observed in natural PKSs. Suitable interpolypeptide linkers to join polypeptides and intrapolypeptide linkers to join modules within a polypeptide are described in PCT publication WO 00/47724.

A partial list of sources of PKS sequences for use in making chimeric molecules, for illustration and not limitation, includes Avermectin (U.S. Pat. No. 5,252,474; MacNeil et al., 1993, Industrial Microorganisms: Basic and Applied Molecular Genetics, Baltz, Hegeman, & Skatrud, eds. (ASM), pp. 245-256; MacNeil et al., 1992, Gene 115:119-25); Candicidin (FRO008) (Hu et al., 1994, Mol. Microbiol. 14:163-72); Epothilone (U.S. Pat. No. 6,303,342); Erythromycin (WO 93/13663; U.S. Pat. No. 5,824,513; Donadio et al., 1991, Science 252:675-79; Cortes et al., 1990, Nature 348:176-8); FK-506 (Motamedi et al., 1998, Eur. J. Biochem. 256:528-34; Motamedi et al., 1997, Eur. J. Biochem. 244:74-80); FK-520 (U.S. Pat. No. 6,503,737; see also Nielsen et al., 1991, Biochem. 30:5789-96); Lovastatin (U.S. Pat. No. 5,744,350); Nemadectin (MacNeil et al., 1993, supra); Niddamycin (Kakavas et al., 1997, J. Bacteriol. 179:7515-22); Oleandomycin (Swan et al., 1994, Mol. Gen. Genet. 242:358-62; U.S. Pat. No. 6,388,099; Olano et al., 1998, Mol. Gen. Genet. 259:299-308); Platenolide (EP Pat. App. 791,656); Rapamycin (Schwecke et al., 1995, Proc. Natl. Acad. Sci. USA 92:7839-43; Aparicio et al., 1996, Gene 169:9-16); Rifamycin (August et al., 1998, Chemistry & Biology, 5: 69-79); Soraphen (U.S. Pat. No. 5,716,849; Schupp et al., 1995, J. Bacteriology 177: 3673-79); Spiramycin (U.S. Pat. No. 5,098,837); Tylosin (EP 0 791,655; Kuhstoss et al., 1996, Gene 183:231-36; U.S. Pat. No. 5,876,991). Additional suitable PKS coding sequences remain to be discovered and characterized, but will be available to those of skill (e.g., by reference to GenBank).

The jerangolid PKS-encoding polynucleotides of the invention may also be used in the production of libraries of PKSs (i.e., modified and chimeric PKSs comprising at least a portion of the jerangolid PKS sequence). The invention provides libraries of polyketides by generating modifications in, or using a portion of, the jerangolid PKS so that the protein complexes produced by the cluster have altered activities in one or more respects, and thus produce polyketides other than the natural jerangolid product of the PKS. Novel polyketides may thus be prepared, or polyketides in general prepared more readily, using this method. By providing a large number of different genes or gene clusters derived from a naturally occurring PKS gene cluster, each of which has been modified in a different way from the native PKS cluster, an effectively combinatorial library of polyketides can be produced as a result of the multiple variations in these activities. Expression vectors containing nucleotide sequences encoding a variety of PKS systems for the production of different polyketides can be transformed into the appropriate host cells to construct a polyketide library. In one approach, a mixture of such vectors is transformed into the selected host cells and the resulting cells plated into individual colonies and selected for successful transformants. Each individual colony has the ability to produce a particular PKS and ultimately a particular polyketide. A variety of strategies can be devised to obtain a multiplicity of colonies each containing a PKS gene cluster derived from the naturally occurring host gene cluster so that each colony in the library produces a different PKS and ultimately a different polyketide. The number of different polyketides that are produced by the library is typically at least four, more typically at least ten, and preferably at least 20, more preferably at least 50, reflecting similar numbers of different altered PKS gene clusters and PKS gene products. The number of members in the library is arbitrarily chosen; however, the degrees of freedom outlined above with respect to the variation of starter, extender units, stereochemistry, oxidation state, and chain length are quite large. The polyketide producing colonies can be identified and isolated using known techniques and the produced polyketides further characterized. The polyketides produced by these colonies can be used collectively in a panel to represent a library or may be assessed individually for activity.
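
As an informal illustration of how quickly these degrees of freedom multiply, the following Python sketch enumerates every combination of a few design choices. The option names and values are hypothetical placeholders chosen for the example, not constructs disclosed herein, and stereochemistry is omitted for brevity.

```python
from itertools import product

# Hypothetical, illustrative options for each degree of freedom.
# None of these lists is taken from the specification.
OPTIONS = {
    "starter": ["acetate", "propionate"],
    "extender": ["malonyl-CoA", "methylmalonyl-CoA"],
    "oxidation_state": ["keto", "hydroxyl", "enoyl", "methylene"],
    "chain_length": [5, 6, 7],
}


def enumerate_designs(options):
    """Return one dict per combination of the supplied design choices."""
    keys = list(options)
    return [
        dict(zip(keys, combo))
        for combo in product(*(options[k] for k in keys))
    ]


designs = enumerate_designs(OPTIONS)
print(len(designs))  # 2 * 2 * 4 * 3 = 48 combinations
```

Even this small toy set already exceeds most of the library sizes mentioned above, which is one way to see why the number of members is in practice chosen arbitrarily rather than exhaustively.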

Colonies in the library are induced to produce the relevant synthases and thus to produce the relevant polyketides to obtain a library of candidate polyketides. The polyketides secreted into the media can be screened for binding to desired targets, such as receptors, signaling proteins, and the like. The supernatants per se can be used for screening, or partial or complete purification of the polyketides can first be effected. Typically, such screening methods involve detecting the binding of each member of the library to a receptor or other target ligand. Binding can be detected either directly or through a competition assay. Means to screen such libraries for binding are well known in the art. Alternatively, individual polyketide members of the library can be tested against a desired target. In this event, screens wherein the biological response of the target is measured can be included.

As noted above, the DNA compounds of the invention can be expressed in host cells for production of proteins and of known and novel compounds. Preferred hosts include fungal systems such as yeast and procaryotic hosts, but single cell cultures of, for example, mammalian cells could also be used. A variety of methods for heterologous expression of PKS genes and host cells suitable for expression of these genes and production of polyketides are described, for example, in U.S. Pat. Nos. 5,843,718 and 5,830,750; WO 01/31035, WO 01/27306, and WO 02/068613; and U.S. patent application Ser. Nos. 10/087,451 (published as US2002000087451); 60/355,211; and 60/396,513 (corresponding to published application 20020045220).

Appropriate host cells for the expression of the hybrid PKS genes include those organisms capable of producing the needed precursors, such as malonyl-CoA, methylmalonyl-CoA, ethylmalonyl-CoA, and methoxymalonyl-ACP, and having phosphopantetheinylation systems capable of activating the ACP domains of modular PKSs. See, for example, U.S. Pat. No. 6,579,695. However, as disclosed in U.S. Pat. No. 6,033,883, a wide variety of hosts can be used, even though some hosts natively do not contain the appropriate post-translational mechanisms to activate the acyl carrier proteins of the synthases. Also see WO 97/13845 and WO 98/27203. The host cell may natively produce none, some, or all of the required polyketide precursors, and may be genetically engineered so as to produce the required polyketide precursors. Such hosts can be modified with the appropriate recombinant enzymes to effect these modifications. Suitable host cells include Streptomyces, E. coli, yeast, and other procaryotic hosts which use control sequences compatible with Streptomyces spp. Examples of suitable hosts that either natively produce modular polyketides or have been engineered so as to produce modular polyketides include but are not limited to actinomycetes such as Streptomyces coelicolor, Streptomyces venezuelae, Streptomyces fradiae, Streptomyces ambofaciens, and Saccharopolyspora erythraea, eubacteria such as Escherichia coli, myxobacteria such as Myxococcus xanthus, and yeasts such as Saccharomyces cerevisiae.

In one embodiment, any native modular PKS genes in the host cell have been deleted to produce a “clean host,” as described in U.S. Pat. No. 5,672,491, incorporated herein by reference.

In some embodiments, the host cell expresses, or is engineered to express, a polyketide “tailoring” or “modifying” enzyme. Once a PKS product is released, it is subject to post-PKS tailoring reactions. These reactions are important for biological activity and for the diversity seen among polyketides. Tailoring enzymes normally associated with polyketide biosynthesis include oxygenases, glycosyl- and methyl-transferases, acyltransferases, halogenases, cyclases, aminotransferases, and hydroxylases. In addition to biosynthetic accessory activities, secondary metabolite clusters often code for activities such as transport.

Tailoring enzymes for modification of a product of the jerangolid PKS, a non-jerangolid PKS, or a chimeric PKS, can be those normally associated with jerangolid biosynthesis or “heterologous” tailoring enzymes. Tailoring enzymes can be expressed in the organism in which they are naturally produced, or as recombinant proteins in heterologous hosts. In some cases, the structure produced by the heterologous or hybrid PKS may be modified with different efficiencies by post-PKS tailoring enzymes from different sources. In such cases, post-PKS tailoring enzymes can be recruited from other pathways to obtain the desired compound. For example, the tailoring enzymes of the jerangolid PKS gene cluster can be expressed heterologously to modify polyketides produced by non-jerangolid synthases or can be inactivated in the jerangolid producer. Alternatively, the unmodified polyketide compounds can be produced in the recombinant host cell, and the desired modification (e.g., oxidation) steps carried out in vitro (e.g., using purified enzymes, isolated from native sources or recombinantly produced) or in vivo in a converting cell different from the host cell (e.g., by supplying the converting cell with the unmodified polyketide).

It will be apparent to one of skill in the art that a variety of recombinant vectors can be utilized in the practice of aspects of the invention. As used herein, “vector” refers to polynucleotide elements that are used to introduce recombinant nucleic acid into cells for either expression or replication. Selection and use of such vehicles is routine in the art. An “expression vector” includes vectors capable of expressing DNAs that are operatively linked with regulatory sequences, such as promoter regions. Thus, an expression vector refers to a recombinant DNA or RNA construct, such as a plasmid, a phage, recombinant virus or other vector that, upon introduction into an appropriate host cell, results in expression of the cloned DNA. Appropriate expression vectors are well known to those of skill in the art and include those that are replicable in eukaryotic cells and/or prokaryotic cells and those that remain episomal or those that integrate into the host cell genome.

The vectors used to perform the various operations to replace the enzymatic activity in the host PKS genes or to support mutations in these regions of the host PKS genes may be chosen to contain control sequences operably linked to the resulting coding sequences in a manner that expression of the coding sequences may be effected in an appropriate host. Suitable control sequences include those that function in eucaryotic and procaryotic host cells. If the cloning vectors employed to obtain PKS genes encoding derived PKS lack control sequences for expression operably linked to the encoding nucleotide sequences, the nucleotide sequences are inserted into appropriate expression vectors. This can be done individually, or using a pool of isolated encoding nucleotide sequences, which can be inserted into host vectors, the resulting vectors transformed or transfected into host cells, and the resulting cells plated out into individual colonies.

Suitable control sequences for single cell cultures of various types of organisms are well known in the art. Control systems for expression in yeast are widely available and are routinely used. Control elements include promoters, optionally containing operator sequences, and other elements depending on the nature of the host, such as ribosome binding sites. Particularly useful promoters for procaryotic hosts include those from PKS gene clusters that result in the production of polyketides as secondary metabolites, including those from Type I or aromatic (Type II) PKS gene clusters. Examples are act promoters, tcm promoters, spiramycin promoters, and the like. However, other bacterial promoters, such as those derived from sugar metabolizing enzymes, such as galactose, lactose (lac) and maltose, are also useful. Additional examples include promoters derived from biosynthetic enzymes such as for tryptophan (trp), the β-lactamase (bla), bacteriophage lambda PL, and T5. In addition, synthetic promoters, such as the tac promoter (U.S. Pat. No. 4,551,433), can be used.

As noted, particularly useful control sequences are those which themselves, or with suitable regulatory systems, activate expression during transition from growth to stationary phase in the vegetative mycelium. The system contained in the plasmid identified as pCK7, i.e., the actI/actIII promoter pair and the actII-ORF4 (an activator gene), is particularly preferred. Particularly preferred hosts are those that lack their own means for producing polyketides so that a cleaner result is obtained. Illustrative control sequences, vectors, and host cells of these types include the modified S. coelicolor CH999 and vectors described in PCT publication WO 96/40968 and similar strains of S. lividans. See U.S. Pat. Nos. 5,672,491; 5,830,750; 5,843,718; and 6,177,262, each of which is incorporated herein by reference.

Other regulatory sequences may also be desirable which allow for regulation of expression of the PKS sequences relative to the growth of the host cell. Regulatory sequences are known to those of skill in the art, and examples include those which cause the expression of a gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Other types of regulatory elements may also be present in the vector, for example, enhancer sequences. Selectable markers can also be included in the recombinant expression vectors. A variety of markers are known which are useful in selecting for transformed cell lines and generally comprise a gene whose expression confers a selectable phenotype on transformed cells when the cells are grown in an appropriate selective medium. Such markers include, for example, genes that confer antibiotic resistance or sensitivity to the plasmid. Alternatively, several polyketides are naturally colored, and this characteristic provides a built-in marker for screening cells successfully transformed by the present constructs.

The various PKS nucleotide sequences, or a mixture of such sequences, can be cloned into one or more recombinant vectors as individual cassettes, with separate control elements or under the control of a single promoter. The PKS subunits or components can include flanking restriction sites to allow for the easy deletion and insertion of other PKS subunits so that hybrid or chimeric PKSs can be generated. The design of such restriction sites is known to those of skill in the art and can be accomplished using the techniques described above, such as site-directed mutagenesis and PCR. Methods for introducing the recombinant vectors of the present invention into suitable hosts are known to those of skill in the art and typically include the use of CaCl₂ or other agents, such as divalent cations, lipofection, DMSO, protoplast transformation, conjugation, and electroporation.

Thus, the present invention provides recombinant DNA molecules and vectors comprising those recombinant DNA molecules that encode at least a portion of the jerangolid PKS and that, when transformed into a host cell and the host cell is cultured under conditions that lead to the expression of said jerangolid PKS enzymes, result in the production of polyketides including but not limited to jerangolid and/or analogs or derivatives thereof in useful quantities. The present invention also provides recombinant host cells comprising those recombinant vectors.

Suitable culture conditions for production of polyketides using the cells of the invention will vary according to the host cell and the nature of the polyketide being produced, but will be known to those of skill in the art. See, for example, the examples below and WO 98/27203 “Production of Polyketides in Bacteria and Yeast” and WO 01/83803 “Overproduction Hosts For Biosynthesis of Polyketides.”

The polyketide product produced by host cells of the invention can be recovered (i.e., separated from the producing cells and at least partially purified) using routine techniques (e.g., extraction from broth followed by chromatography).

The compositions, cells and methods of the invention may be directed to the preparation of an individual polyketide or a number of polyketides. The polyketide may or may not be novel, but the method of preparation permits a more convenient or alternative method of preparing it.

The following Examples are intended to illustrate, but not limit, the scope of the invention.

EXAMPLE 1 Isolation of Jerangolid PKS Cosmids

Genomic DNA was isolated from Sorangium cellulosum Soce307, the producer of jerangolid, using an established protocol (Jaoua, S., Neff, S., and Schupp, T. “Transfer of mobilizable plasmids to Sorangium cellulosum and evidence for their integration into the chromosome,” 1992 Plasmid 28:157-165). The DNA was partially digested with Sau3AI using a serial dilution method and libraries were constructed in SuperKOS (a smaller derivative of SuperCos-1) using the protocol for SuperCos-1 from Stratagene. Colonies were picked, cosmid DNA was isolated on the Qiagen robot, and the DNA was submitted for end sequencing. The data were analyzed by BLAST and all PKS positive cosmids were prepared in larger amounts for further analysis.
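
For readers less familiar with partial digestion, the following Python sketch shows on a toy sequence why cutting at only some Sau3AI sites (recognition sequence GATC) yields a range of overlapping fragment sizes. The sequence, function names, and size window are hypothetical and chosen only for illustration; they do not reproduce the protocol actually used.

```python
import re
from itertools import combinations

SAU3AI_SITE = "GATC"  # Sau3AI recognition sequence; cleavage is 5' of the G


def cut_positions(seq):
    """Return 0-based indices where a GATC site starts (the cut point)."""
    return [m.start() for m in re.finditer(SAU3AI_SITE, seq.upper())]


def partial_digest_lengths(seq):
    """Lengths of every fragment bounded by any two cut points or ends.

    In a partial digest any subset of sites may be cut, so a fragment can
    span from any boundary to any later boundary.
    """
    bounds = sorted({0, len(seq), *cut_positions(seq)})
    return sorted({b - a for a, b in combinations(bounds, 2)})


def size_selected(lengths, lo, hi):
    """Keep only fragment lengths inside an (assumed) cloning size window."""
    return [n for n in lengths if lo <= n <= hi]


toy = "AAGATCTTTGATCCC"  # hypothetical 15-nt sequence with two sites
print(cut_positions(toy))            # [2, 9]
print(partial_digest_lengths(toy))   # [2, 6, 7, 9, 13, 15]
print(size_selected(partial_digest_lengths(toy), 6, 9))  # [6, 7, 9]
```

A complete digest of the same toy sequence would give only the three shortest non-overlapping pieces; the partial digest adds the longer fragments that span uncut sites.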

End sequencing of cosmid and fosmid libraries of the Soce307 genome gave 13 cosmids with PKS sequence on at least one end. Five of these cosmid/fosmid end sequences were highly similar (>92% identity at the nucleotide level) to sequence from the ambruticin PKS, disclosed in co-pending U.S. application Ser. No. 60/551,103, filed 2 Mar. 2004 and incorporated herein by reference in its entirety, indicating they probably contain the jerangolid cluster.
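
The identity threshold mentioned above can be illustrated with a minimal Python sketch that scores already-aligned sequences and keeps those above 92%. This is not the BLAST analysis actually performed: the identity definition (matches over ungapped aligned columns), the clone names, and the toy sequences are all assumptions made for the example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical positions over columns where neither has a gap.

    Both inputs must already be aligned to equal length, with '-' as the
    gap character. Other tools may define identity differently.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def reads_above_threshold(end_reads, reference, threshold=92.0):
    """Names of end reads whose identity to the reference exceeds threshold."""
    return [
        name
        for name, read in end_reads.items()
        if percent_identity(read, reference) > threshold
    ]


reference = "ACGTA" * 5  # hypothetical 25-nt reference segment
end_reads = {
    "clone_A": reference[:-1] + "C",  # one mismatch: 24/25 = 96%
    "clone_B": "TTT" + reference[3:],  # three mismatches: 22/25 = 88%
}
print(reads_above_threshold(end_reads, reference))  # ['clone_A']
```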

All publications and patent documents cited herein are incorporated herein by reference as if each such publication or document was specifically and individually indicated to be incorporated herein by reference.

Although the present invention has been described in detail with reference to specific embodiments, those of skill in the art will recognize that modifications and improvements are within the scope and spirit of the invention. Citation of publications and patent documents is not intended as an admission that any such document is pertinent prior art, nor does it constitute any admission as to the contents or date of the same. The invention having now been described by way of written description, those of skill in the art will recognize that the invention can be practiced in a variety of embodiments and that the foregoing description is for purposes of illustration and not limitation of the following claims.

1. A purified or recombinant nucleic acid comprising a nucleotide sequence that encodes at least one polypeptide required for the biosynthesis of jerangolid, wherein the complement of said nucleotide sequence hybridizes to a sequence selected from the group consisting of nucleotides 1-67323 of SEQ ID NO:1, under conditions of hybridization at 65° C. for 36 hours and washing 3 times at high stringency with 0.1×SSC and 0.5% SDS for 20 minutes at 65° C.

2. A purified or recombinant nucleic acid comprising a nucleotide sequence that encodes at least one module of the jerangolid polyketide synthase, wherein the complement of said nucleotide sequence hybridizes to a sequence selected from the group consisting of nucleotides that encode modules of the jerangolid PKS as listed in Table 1.

3. A purified or recombinant nucleic acid according to claim 1, wherein said polypeptide comprises a β-ketoacylsynthase domain and wherein the complement of said nucleotide sequence hybridizes to a sequence selected from the group consisting of β-ketoacylsynthase domains as listed in Table 1, under conditions of hybridization at 65° C. for 36 hours and washing 3 times at high stringency with 0.1×SSC and 0.5% SDS for 20 minutes at 65° C.

4. A purified or recombinant nucleic acid according to claim 1, wherein said polypeptide comprises an acyltransferase domain and wherein the complement of said nucleotide sequence hybridizes to a sequence selected from the group consisting of acyltransferase domains as listed in Table 1, under conditions of hybridization at 65° C. for 36 hours and washing 3 times at high stringency with 0.1×SSC and 0.5% SDS for 20 minutes at 65° C.

5. A purified or recombinant nucleic acid according to claim 1, wherein said polypeptide comprises a β-ketoreductase domain and wherein the complement of said nucleotide sequence hybridizes to a sequence selected from the group consisting of β-ketoreductase domains as listed in Table 1, under conditions of hybridization at 65° C. for 36 hours and washing 3 times at high stringency with 0.1×SSC and 0.5% SDS for 20 minutes at 65° C.

6. A purified or recombinant nucleic acid according to claim 1, wherein said polypeptide comprises a dehydratase domain and wherein the complement of said nucleotide sequence hybridizes to a sequence selected from the group consisting of dehydratase domains as listed in Table 1, under conditions of hybridization at 65° C. for 36 hours and washing 3 times at high stringency with 0.1×SSC and 0.5% SDS for 20 minutes at 65° C.

7. A purified or recombinant nucleic acid according to claim 1, wherein said polypeptide comprises an enoylreductase domain and wherein the complement of said nucleotide sequence hybridizes to enoylreductase domains as listed in Table 1, under conditions of hybridization at 65° C. for 36 hours and washing 3 times at high stringency with 0.1×SSC and 0.5% SDS for 20 minutes at 65° C.

8. A purified or recombinant nucleic acid according to claim 1, wherein said polypeptide comprises an acyl carrier protein domain and wherein the complement of said nucleotide sequence hybridizes to a sequence selected from the group consisting of acyl carrier protein domains as listed in Table 1, under conditions of hybridization at 65° C. for 36 hours and washing 3 times at high stringency with 0.1×SSC and 0.5% SDS for 20 minutes at 65° C.

9. A purified or recombinant polypeptide involved in the biosynthesis of a jerangolid, wherein said polypeptide has an amino acid sequence that can be encoded by a nucleic acid sequence of claim 1.

10. The polypeptide of claim 9 that can be encoded by the gene jerA.

11. The polypeptide of claim 9 that can be encoded by the gene jerB.

12. The polypeptide of claim 9 that can be encoded by the gene jerC.

13. The polypeptide of claim 9 that can be encoded by the gene jerD.

14. The polypeptide of claim 9 that can be encoded by the gene jerE.

15. The polypeptide of claim 9 that can be encoded by the gene jerF.

16. A method of making a jerangolid or jerangolid analog, said method comprising expressing at least one recombinant gene of claim 1 in a host cell capable of producing polyketides.